stochastic noise
AnImprovedAnalysisofGradientTracking forDecentralizedMachineLearning
Methods that train machine learning models on decentralized data offer many advantages over traditional centralized approaches in core aspects such as data ownership, privacy, fault tolerance and scalability [12,33]. Manycurrent efforts inthis direction come under the banner offederated learning [17, 29, 28, 12], where a central entity orchestrates the training and collects aggregate updates from the participating devices.
Evaluating the Impact of Radiographic Noise on Chest X-ray Semantic Segmentation and Disease Classification Using a Scalable Noise Injection Framework
Jiu, Derek, Nijjer, Kiran, Chinta, Nishant, Bui, Ryan, Zhu, Kevin
Deep learning models are increasingly used for radiographic analysis, but their reliability is challenged by the stochastic noise inherent in clinical imaging. A systematic, cross-task understanding of how different noise types impact these models is lacking. Here, we evaluate the robustness of state-of-the-art convolutional neural networks (CNNs) to simulated quantum (Poisson) and electronic (Gaussian) noise in two key chest X-ray tasks: semantic segmentation and pulmonary disease classification. Using a novel, scalable noise injection framework, we applied controlled, clinically-motivated noise severities to common architectures (UNet, DeepLabV3, FPN; ResNet, DenseNet, EfficientNet) on public datasets (Landmark, ChestX-ray14). Our results reveal a stark dichotomy in task robustness. Semantic segmentation models proved highly vulnerable, with lung segmentation performance collapsing under severe electronic noise (Dice Similarity Coefficient drop of 0.843), signifying a near-total model failure. In contrast, classification tasks demonstrated greater overall resilience, but this robustness was not uniform. We discovered a differential vulnerability: certain tasks, such as distinguishing Pneumothorax from Atelectasis, failed catastrophically under quantum noise (AUROC drop of 0.355), while others were more susceptible to electronic noise. These findings demonstrate that while classification models possess a degree of inherent robustness, pixel-level segmentation tasks are far more brittle. The task- and noise-specific nature of model failure underscores the critical need for targeted validation and mitigation strategies before the safe clinical deployment of diagnostic AI.
Multi-Objective Optimization of ReRAM Crossbars for Robust DNN Inferencing under Stochastic Noise
Yang, Xiaoxuan, Belakaria, Syrine, Joardar, Biresh Kumar, Yang, Huanrui, Doppa, Janardhan Rao, Pande, Partha Pratim, Chakrabarty, Krishnendu, Li, Hai
--Resistive random-access memory (ReRAM) is a promising technology for designing hardware accelerators for deep neural network (DNN) inferencing. We propose the design and optimization of a high-performance, area-and energy-efficient ReRAMbased hardware accelerator to achieve robust DNN inferencing in the presence of stochastic noise. We make two key technical contributions. First, we propose a stochastic-noise-aware training method, referred to as ReSNA, to improve the accuracy of DNN inferencing on ReRAM crossbars with stochastic noise. Second, we propose an information-theoretic algorithm, referred to as CF-MESMO, to identify the Pareto set of solutions to trade-off multiple objectives, including inferencing accuracy, area overhead, execution time, and energy consumption. The main challenge in this context is that executing the ReSNA method to evaluate each candidate ReRAM design is prohibitive. T o address this challenge, we utilize the continuous-fidelity evaluation of ReRAM designs associated with prohibitive high computation cost by varying the number of training epochs to trade-off accuracy and cost. CF-MESMO iteratively selects the candidate ReRAM design and fidelity pair that maximizes the information gained per unit computation cost about the optimal Pareto front. Our experiments on benchmark DNNs show that the proposed algorithms efficiently uncover high-quality Pareto fronts. On average, ReSNA achieves 2. 57% inferencing accuracy improvement for ResNet20 on the CIF AR-10 dataset with respect to the baseline configuration. Moreover, CF-MESMO algorithm achieves 90. Resistive random access memory (ReRAM) has emerged as a promising nonvolatile memory technology due to its multi-level cell, small cell size, and low access time and energy consumption. Prior work has shown that the crossbar structure of ReRAM arrays can efficiently execute matrix-vector multiplication [1], [2], the predominant computational kernel associated with deep neural networks (DNNs). ReRAM-based accelerators for fast and efficient DNN training and inferencing have been extensively studied [3]-[8]. However, a key challenge in executing DNN inferencing [9]- [11] on ReRAM-based architecture arises due to nonidealities of ReRAM devices, which can degrade the accuracy of inferencing.
FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems
Kim, Jeongsol, Kim, Bryan Sangwoo, Ye, Jong Chul
Flow matching is a recent state-of-the-art framework for generative modeling based on ordinary differential equations (ODEs). While closely related to diffusion models, it provides a more general perspective on generative modeling. Although inverse problem solving has been extensively explored using diffusion models, it has not been rigorously examined within the broader context of flow models. Therefore, here we extend the diffusion inverse solvers (DIS) - which perform posterior sampling by combining a denoising diffusion prior with an likelihood gradient - into the flow framework. Specifically, by driving the flow-version of Tweedie's formula, we decompose the flow ODE into two components: one for clean image estimation and the other for noise estimation. By integrating the likelihood gradient and stochastic noise into each component, respectively, we demonstrate that posterior sampling for inverse problem solving can be effectively achieved using flows. Our proposed solver, Flow-Driven Posterior Sampling (FlowDPS), can also be seamlessly integrated into a latent flow model with a transformer architecture. Across four linear inverse problems, we confirm that FlowDPS outperforms state-of-the-art alternatives, all without requiring additional training.
Explicit and Implicit Graduated Optimization in Deep Neural Networks
Graduated optimization is a global optimization technique that is used to minimize a multimodal nonconvex function by smoothing the objective function with noise and gradually refining the solution. This paper experimentally evaluates the performance of the explicit graduated optimization algorithm with an optimal noise scheduling derived from a previous study and discusses its limitations. It uses traditional benchmark functions and empirical loss functions for modern neural network architectures for evaluating. In addition, this paper extends the implicit graduated optimization algorithm, which is based on the fact that stochastic noise in the optimization process of SGD implicitly smooths the objective function, to SGD with momentum, analyzes its convergence, and demonstrates its effectiveness through experiments on image classification tasks with ResNet architectures.
Finite-Time Analysis of Simultaneous Double Q-learning
Q-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in the Q-learning update. To address this issue, double Q-learning employs two independent Q-estimators which are randomly selected and updated during the learning process. This paper proposes a modified double Q-learning, called simultaneous double Q-learning (SDQ), with its finite-time analysis. SDQ eliminates the need for random selection between the two Q-estimators, and this modification allows us to analyze double Q-learning through the lens of a novel switching system framework facilitating efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double Q-learning while retaining the ability to mitigate the maximization bias. Finally, we derive a finite-time expected error bound for SDQ.
mL-BFGS: A Momentum-based L-BFGS for Distributed Large-Scale Neural Network Optimization
Niu, Yue, Fabian, Zalan, Lee, Sunwoo, Soltanolkotabi, Mahdi, Avestimehr, Salman
Quasi-Newton methods still face significant challenges in training large-scale neural networks due to additional compute costs in the Hessian related computations and instability issues in stochastic training. A well-known method, L-BFGS that efficiently approximates the Hessian using history parameter and gradient changes, suffers convergence instability in stochastic training. So far, attempts that adapt L-BFGS to large-scale stochastic training incur considerable extra overhead, which offsets its convergence benefits in wall-clock time. In this paper, we propose mL-BFGS, a lightweight momentum-based L-BFGS algorithm that paves the way for quasi-Newton (QN) methods in large-scale distributed deep neural network (DNN) optimization. mL-BFGS introduces a nearly cost-free momentum scheme into L-BFGS update and greatly reduces stochastic noise in the Hessian, therefore stabilizing convergence during stochastic optimization. For model training at a large scale, mL-BFGS approximates a block-wise Hessian, thus enabling distributing compute and memory costs across all computing nodes. We provide a supporting convergence analysis for mL-BFGS in stochastic settings. To investigate mL-BFGS potential in large-scale DNN training, we train benchmark neural models using mL-BFGS and compare performance with baselines (SGD, Adam, and other quasi-Newton methods). Results show that mL-BFGS achieves both noticeable iteration-wise and wall-clock speedup.